Human CrkRS-related gene variant associated with lung cancers

ABSTRACT

The invention relates to the nucleic acid and polypeptide sequences of a novel human CrkRS-related gene variant. The invention also provides the process for producing the polypeptide encoded by the variant. The invention further provides the use of the nucleic acid and the polypeptide of the gene variant in diagnosing diseases, in particular, lung cancers.

FIELD OF THE INVENTION

[0001] The invention relates to the nucleic acid of a novel human CrkRS-related gene variant, the polypeptide encoded thereby, the preparation process thereof, and the uses of the same in diagnosing diseases associated with the deficiency of CrkRS gene, in particular, lung cancers.

BACKGROUND OF THE INVENTION

[0002] Lung cancer is one of the major causes of cancer-related deaths in the world. There are two primary types of lung cancers: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) (Carney, (1992a) Curr. Opin. Oncol. 4:292-8). Small cell lung cancer accounts for approximately 25% of lung cancer and spreads aggressively (Smyth et al. (1986) Q J Med. 61: 969-76; Carney, (1992b) Lancet 339: 843-6). Non-small cell lung cancer represents the majority (about 75%) of lung cancer and is further divided into three main subtypes: squamous cell carcinoma, adenocarcinoma, and large cell carcinoma (Ihde and Minna, (1991) Curr. Probl. Cancer 15: 105-54). In recent years, much progress has been made toward understanding the molecular and cellular biology of lung cancers. Many important contributions have been made by the identification of several key genetic factors associated with lung cancers. However, the treatments of lung cancers still mainly depend on surgery, chemotherapy, and radiotherapy. This is because the molecular mechanisms underlying the pathogenesis of lung cancers remain largely unclear.

[0003] A recent hypothesis suggested that lung cancer is caused by genetic mutations of at least 10 to 20 genes (Sethi, (1997) BMJ. 314: 652-655). Therefore, future strategies for the prevention and treatment of lung cancers will be focused on the elucidation of these genetic substrates, in particular, the genes localized on chromosome 17, a region shown to be associated with the development of lung cancer (Sameshima et al. (1990) Biochem Biophys Res Commun 173:697-703; Kato et al. (1993) Jpn J Cancer Res 84:355-9; Shimizu et al. (1993) Cancer 71:725-8; Levin et al. (1995) Genes Chromosomes Cancer 13:175-85; Abujiang et al. (1998) Oncogene 17:3029-33; Konishi et al. (1998) Oncogene 17:2095-100). Recently, the mapping of the CrkRS transcript (CrkRS being a novel human protein kinase, Cdc2-related kinase with arginine/serine-rich domain) to this region (Ko et al. (2001) J Cell Sci 114:2591-603) raised the possibility that this novel gene may have a role in the tumorigenic process of lung cancer. Therefore, gene variants of CrkRS may be important targets for diagnostic markers of lung cancer.

SUMMARY OF THE INVENTION

[0004] The present invention provides a CrkRS-related gene variant and the polypeptide encoded thereby as well as the fragments thereof. The nucleotide sequence of the gene variant and the polypeptide encoded thereby can be used for the diagnosis of diseases associated with CrkRS gene, in particular, lung cancers.

[0005] The invention further provides an expression vector and host cell for expressing the polypeptide of the invention.

[0006] The invention further provides a method for producing the polypeptide of the invention.

[0007] The invention further provides an antibody specifically binding to the polypeptide of the invention.

[0008] The invention also provides methods for diagnosing diseases associated with the deficiency of the human CrkRS gene, in particular, lung cancers.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1 shows the nucleic acid sequence (SEQ ID NO:1) and amino acid sequence (SEQ ID NO:2) of CrkRSV.

[0010] FIG. 2 shows the nucleotide sequence alignment between the human CrkRS gene and CrkRSV.

[0011] FIG. 3 shows the amino acid sequence alignment between the human CrkRS protein and CrkRSV.

DETAILED DESCRIPTION OF THE INVENTION

[0012] According to the present invention, all technical and scientific terms used have the same meanings as commonly understood by persons skilled in the art.

[0013] The term “antibody” used herein denotes intact molecules (a polypeptide or group of polypeptides) as well as fragments thereof, such as Fab, F(ab′)₂, and Fv fragments, which are capable of binding the epitopic determinant. Antibodies are produced by specialized B cells after stimulation by an antigen. Structurally, an antibody consists of four subunits including two heavy chains and two light chains. The internal surface shape and charge distribution of the antibody binding domain are complementary to the features of an antigen. Thus, an antibody can specifically act against the antigen in an immune response.

[0014] The term “base pair (bp)” used herein denotes nucleotides composed of a purine on one strand of DNA which can be hydrogen bonded to a pyrimidine on the other strand. Thymine (or uracil) and adenine residues are linked by two hydrogen bonds. Cytosine and guanine residues are linked by three hydrogen bonds.

[0015] The term “Basic Local Alignment Search Tool (BLAST; Altschul et al., (1997) Nucleic Acids Res. 25: 3389-3402)” used herein denotes programs for evaluation of homologies between a query sequence (amino or nucleic acid) and a test sequence as described by Altschul et al. (Nucleic Acids Res. 25: 3389-3402, 1997). Specific BLAST programs are described as follows:

[0016] (1) BLASTN compares a nucleotide query sequence against a nucleotide sequence database;

[0017] (2) BLASTP compares an amino acid query sequence against a protein sequence database;

[0018] (3) BLASTX compares the six-frame conceptual translation products of a query nucleotide sequence against a protein sequence database;

[0019] (4) TBLASTN compares a query protein sequence against a nucleotide sequence database translated in all six reading frames; and

[0020] (5) TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
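The "six-frame conceptual translation" used by BLASTX, TBLASTN, and TBLASTX can be illustrated with a minimal sketch. It is not the BLAST implementation; it only shows how one nucleotide sequence yields six candidate protein sequences (three reading-frame offsets on each strand). The function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str) -> list[str]:
    """Frames 0-2 come from the given strand, frames 3-5 from its reverse complement."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


frames = six_frame_translations("ATGGACTTACT")
for number, protein in enumerate(frames):
    print(number, protein)
```

Each of the six strings would then be compared against a protein database (BLASTX) or against the six-frame translations of a nucleotide database (TBLASTX).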

[0021] The term “cDNA” used herein denotes nucleic acids that are synthesized from an mRNA template using reverse transcriptase.

[0022] The term “cDNA library” used herein denotes a library composed of complementary DNAs which are reverse-transcribed from mRNAs.

[0023] The term “complement” used herein denotes a polynucleotide sequence capable of forming base pairing with another polynucleotide sequence. For example, the sequence 5′-ATGGACTTACT-3′ binds to the complementary sequence 5′-AGTAAGTCCAT-3′.
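The example pair in paragraph [0023] can be checked with a short sketch: the complementary sequence, written 5′ to 3′, is obtained by reversing the strand and substituting each base with its pairing partner. The function name is illustrative only.

```python
# Watson-Crick pairing partners for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


strand = "ATGGACTTACT"
partner = reverse_complement(strand)
print(partner)  # AGTAAGTCCAT
```

Applying the same operation to the partner returns the original strand, which is why the relationship is symmetric.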

[0024] The term “deletion” used herein denotes a removal of a portion of one or more amino acid residues/nucleotides from a gene.

[0025] The term “expressed sequence tags (ESTs)” used herein denotes short (200 to 500 base pairs) nucleotide sequences derived from either the 5′ or 3′ end of a cDNA.

[0026] The term “expression vector” used herein denotes nucleic acid constructs which contain a cloning site for introducing the DNA into vector, one or more selectable markers for selecting vectors containing the DNA, an origin of replication for replicating the vector whenever the host cell divides, a terminator sequence, a polyadenylation signal, and a suitable control sequence which can effectively express the DNA in a suitable host. The suitable control sequence may include promoter, enhancer and other regulatory sequences necessary for directing polymerases to transcribe the DNA.

[0027] The term “host cell” used herein denotes a cell which is used to receive, maintain, and allow the reproduction of an expression vector comprising DNA. Host cells are transformed or transfected with suitable vectors constructed using recombinant DNA methods. The recombinant DNA introduced with the vector is replicated whenever the cell divides.

[0028] The term “insertion” or “addition” used herein denotes the addition of a portion of one or more amino acid residues/nucleotides to a gene.

[0029] The term “in silico” used herein denotes a process of using computational methods (e.g., BLAST) to analyze DNA sequences.

[0030] The term “polymerase chain reaction (PCR)” used herein denotes a method which increases the copy number of a nucleic acid sequence using a DNA polymerase and a set of primers (about 20 bp oligonucleotides complementary to each strand of DNA) under suitable conditions (successive rounds of primer annealing, strand elongation, and dissociation).

[0031] The term “protein” or “polypeptide” used herein denotes a sequence of amino acids in a specific order that can be encoded by a gene or by a recombinant DNA. It can also be chemically synthesized.

[0032] The term “nucleic acid sequence” or “polynucleotide” used herein denotes a sequence of nucleotide (guanine, cytosine, thymine or adenine) in a specific order that can be a natural or synthesized fragment of DNA or RNA. It may be single-stranded or double-stranded.

[0033] The term “reverse transcriptase-polymerase chain reaction (RT-PCR)” used herein denotes a process which transcribes mRNA to complementary DNA strand using reverse transcriptase followed by polymerase chain reaction to amplify the specific fragment of DNA sequences.

[0034] The term “transformation” used herein denotes a process describing the uptake, incorporation, and expression of exogenous DNA by prokaryotic host cells.

[0035] The term “transfection” used herein denotes a process describing the uptake, incorporation, and expression of exogenous DNA by eukaryotic host cells.

[0036] The term “variant” used herein denotes a fragment of sequence (nucleotide or amino acid) inserted or deleted by one or more nucleotides/amino acids.

[0037] The present invention in the first aspect provides the polypeptide of a novel human CrkRS-related gene variant and the fragments thereof, as well as the nucleic acid sequences encoding the same.

[0038] According to the present invention, the human CrkRS cDNA sequence was used to query the human lung EST databases (a normal lung, a large cell lung cancer, a squamous cell lung cancer and a small cell lung cancer) using the BLAST program to search for CrkRS-related gene variants. Three ESTs showing similarity to CrkRS were identified in the large cell lung cancer, the squamous cell lung cancer and the SCLC databases, respectively. Their corresponding cDNA clones were found to be identical after sequencing and named CrkRSV (CrkRS variant). FIG. 1 shows the nucleic acid sequence of CrkRSV (SEQ ID NO: 1) and the amino acid sequence encoded thereby (SEQ ID NO: 2).

[0039] The full-length CrkRSV cDNA is a 5272 bp clone containing a 4293 bp open reading frame (ORF) extending from 34 bp to 4326 bp, which corresponds to an encoded protein of 1431 amino acid residues with a predicted molecular mass of 157.6 kDa. To determine the variation in sequence of the CrkRSV cDNA clone, an alignment of the CrkRS nucleotide/amino acid sequence with CrkRSV was performed (FIGS. 2 and 3). One major genetic deletion was found in the aligned sequences, showing that CrkRSV lacks a 177 bp segment of the CrkRS sequence spanning 1965-2141 bp. The absence of these 177 bp (59 aa) is an in-frame deletion in the amino acid sequence of CrkRS and generates the CrkRSV polypeptide of 1431 amino acid residues (FIG. 3).
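The coordinates quoted in paragraph [0039] can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative. It confirms that the ORF length is a whole number of codons and that the deleted segment is a multiple of three nucleotides, which is what makes the deletion in-frame.

```python
# Coordinates as stated in the text (1-based, inclusive).
ORF_START, ORF_END = 34, 4326    # ORF within the CrkRSV cDNA
DEL_START, DEL_END = 1965, 2141  # segment of CrkRS that is absent from CrkRSV

orf_length = ORF_END - ORF_START + 1        # 4293 bp
codons = orf_length // 3                    # 1431 codons
deletion_length = DEL_END - DEL_START + 1   # 177 bp
in_frame = deletion_length % 3 == 0         # True: no frameshift downstream
residues_removed = deletion_length // 3     # 59 amino acids

print(orf_length, codons, deletion_length, in_frame, residues_removed)
```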

[0040] In the present invention, a search of ESTs deposited in dbEST (Boguski et al. (1993) Nat Genet. 4: 332-3) at the National Center for Biotechnology Information (NCBI) was performed to determine the tissue distribution of CrkRSV in silico. The result of in silico Northern analysis showed that one EST (GenBank Accession Number BF884818) confirmed the absence of the 177 bp region in the CrkRSV nucleotide sequence. This EST was also generated from a lung tumor cDNA library, suggesting that the absence of the 177 bp nucleotide fragment, located between nucleotides 1964 and 1965 of CrkRSV, may serve as a useful marker for diagnosing lung cancer. Therefore, any nucleotide fragments comprising nucleotides 1964 to 1965 of CrkRSV may be used as probes for determining the presence of CrkRSV under highly stringent conditions. An alternative approach is that any set of primers for amplifying the fragment containing nucleotides 1964 to 1965 of CrkRSV may be used for determining the presence of the variant.

[0041] According to the present invention, the polypeptide of the human CrkRSV and the fragments thereof may be produced via genetic engineering techniques. For instance, they may be produced by appropriate host cells which have been transformed by DNAs that code for the desired polypeptides or fragments thereof. The nucleotide sequence encoding the polypeptide of the human CrkRSV or the fragments thereof is inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted coding sequence in a suitable host. The nucleic acid sequence is inserted into the vector in a manner such that it will be expressed under appropriate conditions (e.g., in proper orientation and correct reading frame and with appropriate expression sequences, including an RNA polymerase binding sequence and a ribosomal binding sequence).

[0042] Any method that is known to those skilled in the art may be used to construct expression vectors containing sequences encoding the polypeptide of the human CrkRSV and appropriate transcriptional/translational control elements. These methods may include in vitro recombinant DNA and synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, J. Cold Spring Harbor Press, Plainview N.Y., ch. 4, 8, and 16-17; Ausubel, F. M. et al. (1995) Current protocols in Molecular Biology, John Wiley & Sons, New York N.Y., ch. 9, 13, and 16.)

[0043] A variety of expression vector/host systems may be utilized to express the polypeptide-coding sequence. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vector; yeast transformed with yeast expression vector; insect cell systems infected with virus (e.g., baculovirus); plant cell system transformed with viral expression vector (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV); or animal cell system infected with virus (e.g., vaccinia virus, adenovirus, etc.). Preferably, the host cell is a bacterium, and most preferably, the bacterium is E. coli.

[0044] Alternatively, the polypeptides of the human CrkRSV or the fragments thereof may be synthesized by using chemical methods. For example, peptide synthesis can be performed using various solid-phase techniques (Roberge, J. Y. et al. (1995) Science 269: 202-204). Automated synthesis may be achieved by using the ABI 431A peptide synthesizer (Perkin-Elmer).

[0045] According to the present invention, the fragments of the polypeptides and nucleic acid sequences of the human CrkRSV may be used as immunogens and primers or probes, respectively. Preferably, the purified fragments of the human CrkRSV are used. The fragments may be produced by enzymatic digestion, chemical cleavage of isolated or purified polypeptide or nucleic acid sequences, or chemical synthesis and then may be isolated or purified. Such isolated or purified fragments of the polypeptides and nucleic acid sequences can be directly used as immunogens and primers or probes, respectively.

[0046] The present invention further provides the antibodies which specifically bind one or more out-surface epitopes of the polypeptide of the human CrkRSV.

[0047] According to the present invention, immunization of mammals with immunogens described herein, preferably humans, rabbits, rats, mice, sheep, goats, cows, or horses, is performed following procedures well known to those skilled in the art, for the purpose of obtaining antisera containing polyclonal antibodies or hybridoma lines secreting monoclonal antibodies.

[0048] Monoclonal antibodies can be prepared by standard techniques, given the teachings contained herein. Such techniques are disclosed, for example, in U.S. Pat. Nos. 4,271,145 and 4,196,265. Briefly, an animal is immunized with the immunogen. Hybridomas are prepared by fusing spleen cells from the immunized animal with myeloma cells. The fusion products are screened for those producing antibodies that bind to the immunogen. The positive hybridoma clones are isolated, and the monoclonal antibodies are recovered from those clones.

[0049] Immunization regimens for production of both polyclonal and monoclonal antibodies are well-known in the art. The immunogen may be injected by any of a number of routes, including subcutaneous, intravenous, intraperitoneal, intradermal, intramuscular, mucosal, or a combination thereof. The immunogen may be injected in soluble form, aggregate form, attached to a physical carrier, or mixed with an adjuvant, using methods and materials well-known in the art. The antisera and antibodies may be purified using column chromatography methods well known to those skilled in the art.

[0050] According to the present invention, antibody fragments which contain specific binding sites for the polypeptides or fragments thereof may also be generated. For example, such fragments include, but are not limited to, F(ab′)₂ fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab′)₂ fragments.

[0051] Many gene variants have been found to be associated with diseases (Stallings-Mann et al., (1996) Proc Natl Acad Sci USA 93: 12394-9; Liu et al., (1997) Nat Genet 16:328-9; Siffert et al., (1998) Nat Genet 18: 45-8; Lukas et al., (2001) Cancer Res 61: 3212-9). Since the CrkRSV clone was isolated from a lung cancer cDNA library and its expression in lung cancers was confirmed by in silico Northern analysis, it is suggested that CrkRSV may serve as a marker for the diagnosis of diseases associated with the deficiency of CrkRS gene, in particular, lung cancers. Thus, the expression level of CrkRSV relative to CrkRS may be a useful indicator for screening of patients suspected of having such diseases, and the index of relative expression level (mRNA or protein) may indicate an increased susceptibility to the same.

[0052] Accordingly, the subject invention also provides methods for diagnosing diseases associated with the deficiency of CrkRS gene in a mammal, in particular, lung cancers.

[0053] The method for diagnosing the diseases associated with the deficiency of CrkRS gene may be performed by detecting the nucleotide sequence of the human CrkRSV of the invention which comprises the steps of: (1) extracting total RNA of cells obtained from a mammal; (2) amplifying the RNA by reverse transcriptase-polymerase chain reaction (RT-PCR) with a set of primers to obtain a cDNA comprising the fragments comprising nucleotides 1963 to 1968 of SEQ ID NO: 1; and (3) detecting whether the cDNA sample is obtained. If necessary, the amount of the obtained cDNA sample may be detected.

[0054] In the above embodiment, one of the primers may be designed to have a sequence comprising the nucleotides of SEQ ID NO: 1 containing nucleotides 1963 to 1968, and the other may be designed to have a sequence complementary to the nucleotides of SEQ ID NO: 1 at any other locations downstream of nucleotide 1968. Alternatively, one of the primers may be designed to have a sequence complementary to the nucleotides of SEQ ID NO: 1 containing nucleotides 1963 to 1968, and the other may be designed to have a sequence comprising the nucleotides of SEQ ID NO: 1 at any other locations upstream of nucleotide 1963. In this case, only CrkRSV will be amplified.

[0055] Alternatively, one of the primers may be designed to have a sequence comprising the nucleotides of SEQ ID NO: 1 upstream of nucleotide 1964 and the other may be designed to have a sequence complementary to the nucleotides of SEQ ID NO: 1 downstream of nucleotide 1965. Alternatively, one of the primers may be designed to have a sequence complementary to the nucleotides of SEQ ID NO: 1 upstream of nucleotide 1964 and the other may be designed to have a sequence comprising the nucleotides of SEQ ID NO: 1 downstream of nucleotide 1965. In this case, both CrkRS and CrkRSV will be amplified. The length of the PCR fragment from CrkRSV will be 177 bp shorter than that from CrkRS.
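The two primer strategies of paragraphs [0054] and [0055] can be sketched on a toy template. All sequences below are hypothetical placeholders chosen for readability (they are not taken from SEQ ID NO: 1), and the toy deletion is 9 nt rather than 177 bp. For simplicity the reverse primer site is given as it appears on the top strand; a real reverse primer would be its reverse complement.

```python
# Hypothetical toy sequences, NOT derived from SEQ ID NO: 1.
UPSTREAM = "AACCAACCAACC"
DELETED = "GGGGGGGGG"        # stands in for the 177 bp segment
DOWNSTREAM = "TTCATGTACGTA"

reference = UPSTREAM + DELETED + DOWNSTREAM  # models CrkRS
variant = UPSTREAM + DOWNSTREAM              # models CrkRSV

# Strategy of [0054]: one primer spans the deletion junction,
# so it finds a binding site only in the variant.
junction_primer = UPSTREAM[-6:] + DOWNSTREAM[:6]
junction_in_reference = junction_primer in reference
junction_in_variant = junction_primer in variant


def amplicon_length(template: str, forward: str, reverse_site: str) -> int:
    """Length from the start of the forward site to the end of the reverse site."""
    start = template.index(forward)
    end = template.index(reverse_site) + len(reverse_site)
    return end - start


# Strategy of [0055]: primers flank the junction, so both templates
# amplify and the products differ by the length of the deletion.
forward_primer = UPSTREAM[:6]
reverse_site = DOWNSTREAM[-6:]
reference_product = amplicon_length(reference, forward_primer, reverse_site)
variant_product = amplicon_length(variant, forward_primer, reverse_site)

print(junction_in_reference, junction_in_variant)
print(reference_product - variant_product == len(DELETED))
```

With the real sequences, the same logic gives a size difference of 177 bp between the CrkRS and CrkRSV products, which is what allows the two to be resolved on a gel.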

[0056] Preferably, the primer of the invention contains 15 to 30 nucleotides.

[0057] Total RNA may be isolated from patient samples by using TRIZOL reagents (Life Technology). Tissue samples (e.g., biopsy samples) are powdered under liquid nitrogen before homogenization. RNA purity and integrity are assessed by absorbance at 260/280 nm and by agarose gel electrophoresis. The set of primers designed to amplify the expected size of specific PCR fragments of CrkRSV can be used. PCR fragments are analyzed on a 1% agarose gel using five microliters (10%) of the amplified products. To determine the expression level of the gene variant, the intensity of the PCR products may be determined by using the Molecular Analyst program (version 1.4.1; Bio-Rad).

[0058] The RT-PCR experiment may be performed according to the manufacturer's instructions (Boehringer Mannheim). A 50 μl reaction mixture containing 2 μl total RNA (0.1 μg/μl), 1 μl each primer (20 pM), 1 μl each dNTP (10 mM), 2.5 μl DTT solution (100 mM), 10 μl 5× RT-PCR buffer, 1 μl enzyme mixture, and 28.5 μl sterile distilled water may be subjected to the conditions such as reverse transcription at 60° C. for 30 minutes followed by 35 cycles of denaturation at 94° C. for 2 minutes, annealing at 60° C. for 2 minutes, and extension at 68° C. for 2 minutes. The RT-PCR analysis may be repeated twice to ensure reproducibility, for a total of three independent experiments.
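As a consistency check on the reaction mixture of paragraph [0058], the component volumes can be summed. The sketch assumes two primers and four dNTPs added separately (both counts are assumptions, since the text says only "each"), and reads the RNA volume as 2 μl. Under those assumptions the components add up to the stated 50 μl.

```python
# Volumes in microliters, as listed in the text.
# ASSUMPTIONS: 2 primers and 4 separate dNTPs, 1 ul of each.
N_PRIMERS = 2
N_DNTPS = 4

components_ul = {
    "total RNA": 2.0,
    "primers": 1.0 * N_PRIMERS,
    "dNTPs": 1.0 * N_DNTPS,
    "DTT solution": 2.5,
    "5x RT-PCR buffer": 10.0,
    "enzyme mixture": 1.0,
    "sterile distilled water": 28.5,
}

total_ul = sum(components_ul.values())
buffer_final_strength = 5 * components_ul["5x RT-PCR buffer"] / total_ul  # 5x diluted to 1x

print(total_ul, buffer_final_strength)
```

The 10 μl of 5× buffer in a 50 μl reaction also works out to a 1× final concentration, which agrees with the stated total volume.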

[0059] Another embodiment for diagnosing the diseases associated with the deficiency of CrkRS gene may be performed by detecting the nucleotide sequences of the human CrkRSV of the invention which comprises the steps of: (1) extracting total RNA from a sample obtained from the mammal; (2) amplifying the RNA by reverse transcriptase-polymerase chain reaction (RT-PCR) to obtain a cDNA sample; (3) bringing the cDNA sample into contact with the nucleic acid of SEQ ID NO: 1 and the fragments thereof; and (4) detecting whether the cDNA sample hybridizes with the nucleic acid of SEQ ID NO: 1 or the fragments thereof. If necessary, the amount of hybridized sample may be detected.

[0060] The expression of gene variants can also be analyzed by using the Northern blot hybridization approach. A specific fragment comprising nucleotides 1963 to 1968 of the CrkRSV may be amplified by polymerase chain reaction (PCR) using the primer set designed for RT-PCR. The amplified PCR fragment may be labeled and serve as a probe to hybridize the membranes containing total RNAs extracted from the samples under the conditions of 55° C. in a suitable hybridization solution for 3 hr. Blots may be washed twice in 2×SSC, 0.1% SDS at room temperature for 15 minutes each, followed by two washes in 0.1×SSC and 0.1% SDS at 65° C. for 20 minutes each. After these washes, the blot may be rinsed briefly in suitable washing buffer and incubated in blocking solution for 30 minutes, and then incubated in suitable antibody solution for 30 minutes. Blots may be washed in washing buffer for 30 minutes and equilibrated in suitable detection buffer before detecting the signals. Alternatively, the presence of gene variants (cDNAs or PCR products) can be detected using a microarray approach. The cDNAs or PCR products corresponding to the nucleotide sequences of the present invention may be immobilized on a suitable substrate such as a glass slide. Hybridization can be performed using the labeled mRNAs extracted from samples. After hybridization, nonhybridized mRNAs are removed. The relative abundance of each labeled transcript, hybridizing to a cDNA/PCR product immobilized on the microarray, can be determined by analyzing the scanned images.

[0061] According to the present invention, the method for diagnosing the diseases associated with the deficiency of CrkRS may also be performed by detecting the polypeptide encoded by the human CrkRSV of the invention. For instance, the polypeptide in protein samples obtained from the mammal may be determined by, but is not limited to, the immunoassay wherein the antibody specifically binding to the polypeptide of the invention is contacted with the protein samples, and the antibody-polypeptide complex is detected. If necessary, the amount of antibody-polypeptide complex can be determined.

[0062] The polypeptides encoded by CrkRSV may be expressed in prokaryotic cells by using suitable prokaryotic expression vectors. The cDNA fragments of the CrkRSV gene containing the amino acid coding sequence may be PCR amplified using a primer set with restriction enzyme digestion sites incorporated in the 5′ and 3′ ends, respectively. The PCR products can then be enzyme digested, purified, and inserted into the corresponding sites of the prokaryotic expression vector in-frame to generate recombinant plasmids. Sequence fidelity of this recombinant DNA can be verified by sequencing. The prokaryotic recombinant plasmids may be transformed into host cells (e.g., E. coli BL21 (DE3)). Recombinant protein synthesis may be stimulated by the addition of 0.4 mM isopropylthiogalactoside (IPTG) for 3 h. The bacterially-expressed proteins may be purified.

[0063] The polypeptide of the gene variant may be expressed in animal cells by using eukaryotic expression vectors. Cells may be maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum (FBS; Gibco BRL) at 37° C. in a humidified 5% CO₂ atmosphere. Before transfection, the nucleotide sequence of the gene variant may be amplified with PCR primers containing restriction enzyme digestion sites and ligated into the corresponding sites of the eukaryotic expression vector in-frame. Sequence fidelity of this recombinant DNA can be verified by sequencing. The cells may be plated in 12-well plates one day before transfection at a density of 5×10⁴ cells per well. Transfections may be carried out using Lipofectamine Plus transfection reagent according to the manufacturer's instructions (Gibco BRL). Three hours following transfection, medium containing the complexes may be replaced with fresh medium. Forty-eight hours after incubation, the cells may be scraped into lysis buffer (0.1 M Tris HCl, pH 8.0, 0.1% Triton X-100) for purification of expressed proteins. After these proteins are purified, monoclonal antibodies against these purified proteins (CrkRSV) may be generated using hybridoma technique according to the conventional methods (de StGroth and Scheidegger, (1980) J Immunol Methods 35:1-21; Cote et al. (1983) Proc Natl Acad Sci USA 80: 2026-30; and Kozbor et al. (1985) J Immunol Methods 81:31-42).

[0064] According to the present invention, the presence of the polypeptides of the gene variant in samples obtained from the mammal suspected of having the diseases associated with the deficiency of CrkRS gene may be determined by, but is not limited to, Western blot analysis. Proteins extracted from samples may be separated by SDS-PAGE and transferred to suitable membranes such as polyvinylidene difluoride (PVDF) in transfer buffer (25 mM Tris-HCl, pH 8.3, 192 mM glycine, 20% methanol) with a Trans-Blot apparatus (e.g., Bio-Rad) for 1 h at 100 V. The proteins can be immunoblotted with specific antibodies. For example, a membrane blotted with extracted proteins may be blocked with suitable buffers such as 3% solution of BSA or 3% solution of nonfat milk powder in TBST buffer (10 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.1% Tween 20) and incubated with monoclonal antibody directed against the polypeptides of gene variants. Unbound antibody is removed by washing with TBST for 5×1 minutes. Bound antibody may be detected using commercial ECL Western blotting detecting reagents.

[0065] The following examples are provided for illustration, but not for limiting the invention.

EXAMPLES

Analysis of Human Lung EST Databases

[0066] Expressed sequence tags (ESTs) generated from the large-scale PCR-based sequencing of the 5′-end of human lung (normal, SCLC, squamous cell lung cancer and large cell lung cancer) cDNA clones were compiled and served as EST databases. Sequence comparisons against the nonredundant nucleotide and protein databases were performed using BLASTN and BLASTX programs (Altschul et al., (1997) Nucleic Acids Res. 25: 3389-3402; Gish and States, (1993) Nat Genet 3:266-272), at the National Center for Biotechnology Information (NCBI) with a significance cutoff of p<10⁻¹⁰. ESTs representing putative CrkRSV gene were identified during the course of EST generation.

Isolation of cDNA Clones

[0067] Three identical cDNA clones exhibiting EST sequences similar to the CrkRS gene were isolated from lung cancer cDNA libraries and named CrkRSV. The inserts of these clones were subsequently excised in vivo from the λZAP Express vector using the ExAssist/XLOLR helper phage system (Stratagene). Phagemid particles were excised by coinfecting XL1-Blue MRF′ cells with ExAssist helper phage. The excised pBluescript phagemids were used to infect E. coli XLOLR cells, which lack the amber suppressor necessary for ExAssist phage replication. Infected XLOLR cells were selected using kanamycin resistance. Resultant colonies contained the double stranded phagemid vector with the cloned cDNA insert. A single colony was grown overnight in LB-kanamycin, and DNA was purified using a Qiagen plasmid purification kit.

Full Length Nucleotide Sequencing and Database Comparisons

[0068] Phagemid DNA was sequenced using the Epicentre#SE9101LC SequiTherm EXCEL™II DNA Sequencing Kit for 4200S-2 Global NEW IR² DNA sequencing system (LI-COR). Using the primer-walking approach, full-length sequence was determined. Nucleotide and protein searches were performed using BLAST against the non-redundant database of NCBI.

In Silico Tissue Distribution (Northern) Analysis

[0069] The coding sequence for each cDNA clone was searched against the dbEST sequence database (Boguski et al., (1993) Nat Genet. 4: 332-3) using the BLAST algorithm at the NCBI website. ESTs derived from each tissue were used as a source of information for transcript tissue expression analysis. Tissue distribution for each isolated cDNA clone was determined by ESTs matching that particular sequence variant (insertion or deletion) with a significance cutoff of p<10⁻¹⁰.

REFERENCES

[0070] Abujiang et al. Loss of heterozygosity (LOH) at 17q and 14q in human lung cancers. Oncogene, 17:3029-33, (1998).

[0071] Altschul et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res, 25: 3389-3402, (1997).

[0072] Ausubel et al., Current protocols in Molecular Biology, John Wiley & Sons, New York N.Y., ch. 9, 13, and 16, (1995).

[0073] Boguski et al., dbEST—database for “expressed sequence tags”. Nat Genet. 4: 332-3, (1993).

[0074] Carney, The biology of lung cancer. Curr. Opin. Oncol. 4: 292-8, (1992a).

[0075] Carney, Biology of small-cell lung cancer. Lancet 339: 843-6, (1992b).

[0076] Cote et al., Generation of human monoclonal antibodies reactive with cellular antigens, Proc Natl Acad Sci USA 80: 2026-30 (1983).

[0077] de StGroth and Scheidegger, Production of monoclonal antibodies: strategy and tactics, J Immunol Methods 35:1-21, (1980).

[0078] Gish and States, Identification of protein coding regions by database similarity search, Nat Genet, 3:266-272, (1993).

[0079] Ihde and Minna, Non-small cell lung cancer. Part II: Treatment. Curr. Probl. Cancer 15: 105-54, (1991).

[0080] Kato et al. Establishment of a human small cell lung carcinoma cell line carrying amplification of c-myc gene and chromosomal translocation of t(3p;6p) and t(12q;17p). Jpn J Cancer Res, 84:355-9, (1993).

[0081] Ko et al., CrkRS: a novel conserved Cdc2-related protein kinase that colocalises with SC35 speckles. J Cell Sci, 114:2591-603, (2001).

[0082] Konishi et al., Detailed deletion mapping suggests the involvement of a tumor suppressor gene at 17p13.3, distal to p53, in the pathogenesis of lung cancers. Oncogene, 17:2095-100, (1998).

[0083] Kozbor et al., Specific immunoglobulin production and enhanced tumorigenicity following ascites growth of human hybridomas, J Immunol Methods, 81:31-42 (1985).

[0084] Levin et al., Identification of novel regions of altered DNA copy number in small cell lung tumors. Genes Chromosomes Cancer, 13:175-85, (1995).

[0085] Liu et al., Silent mutation induces exon skipping of fibrillin-1 gene in Marfan syndrome. Nat Genet 16:328-9, (1997).

[0086] Lukas et al., Alternative and aberrant messenger RNA splicing of the mdm2 oncogene in invasive breast cancer. Cancer Res 61:3212-9, (2001).

[0087] Roberge et al., A strategy for a convergent synthesis of N-linked glycopeptides on a solid support. Science 269:202-4, (1995).

[0088] Sambrook, J. Cold Spring Harbor Press, Plainview N.Y., ch. 4, 8, and 16-17.

[0089] Sameshima et al., Point mutation of the p53 gene resulting in splicing inhibition in small cell lung carcinoma. Biochem Biophys Res Commun, 173:697-703, (1990).

[0090] Sethi, Science, medicine, and the future. Lung cancer. BMJ, 314: 652-655, (1997).

[0091] Shimizu et al., Loss of heterozygosity on chromosome arm 17p in small cell lung carcinomas, but not in neurofibromas, in a patient with von Recklinghausen neurofibromatosis. Cancer, 71:725-8, (1993).

[0092] Siffert et al., Association of a human G-protein beta3 subunit variant with hypertension. Nat Genet, 18:45-8, (1998).

[0093] Simpson, A. J. G. EST Accession No. BF884818

[0094] Smyth et al., The impact of chemotherapy on small cell carcinoma of the bronchus. Q J Med, 61: 969-76, (1986).

[0095] Stallings-Mann et al., Alternative splicing of exon 3 of the human growth hormone receptor is the result of an unusual genetic polymorphism. Proc Natl Acad Sci USA 93:12394-9, (1996).

[0096]

1 2 1 5272 DNA Homo sapiens CDS (34)..(4326) 1 cttttttccc ttcttcaggt caggggaaag gga atg ccc aat tca gag aga cat 54 Met Pro Asn Ser Glu Arg His 1 5 ggg ggc aag aag gac ggg agt gga gga gct tct gga act ttg cag ccg 102 Gly Gly Lys Lys Asp Gly Ser Gly Gly Ala Ser Gly Thr Leu Gln Pro 10 15 20 tca tcg gga ggc ggc agc tct aac agc aga gag cgt cac cgc ttg gta 150 Ser Ser Gly Gly Gly Ser Ser Asn Ser Arg Glu Arg His Arg Leu Val 25 30 35 tcg aag cac aag cgg cat aag tcc aaa cac tcc aaa gac atg ggg ttg 198 Ser Lys His Lys Arg His Lys Ser Lys His Ser Lys Asp Met Gly Leu 40 45 50 55 gtg acc ccc gaa gca gca tcc ctg ggc aca gtt atc aaa cct ttg gtg 246 Val Thr Pro Glu Ala Ala Ser Leu Gly Thr Val Ile Lys Pro Leu Val 60 65 70 gag tat gat gat atc agc tct gat tcc gac acc ttc tcc gat gac atg 294 Glu Tyr Asp Asp Ile Ser Ser Asp Ser Asp Thr Phe Ser Asp Asp Met 75 80 85 gcc ttc aaa cta gac cga agg gag aac gac gaa cgt cgt gga tca gat 342 Ala Phe Lys Leu Asp Arg Arg Glu Asn Asp Glu Arg Arg Gly Ser Asp 90 95 100 cgg agc gac cgc ctg cac aaa cat cgt cac cac cag cac agg cgt tcc 390 Arg Ser Asp Arg Leu His Lys His Arg His His Gln His Arg Arg Ser 105 110 115 cgg gac tta cta aaa gct aaa cag acc gaa aaa gaa aaa agc caa gaa 438 Arg Asp Leu Leu Lys Ala Lys Gln Thr Glu Lys Glu Lys Ser Gln Glu 120 125 130 135 gtc tcc agc aag tcg gga tcg atg aag gac cgg ata tcg gga agt tca 486 Val Ser Ser Lys Ser Gly Ser Met Lys Asp Arg Ile Ser Gly Ser Ser 140 145 150 aag cgt tcg aat gag gag act gat gac tat ggg aag gcg cag gta gcc 534 Lys Arg Ser Asn Glu Glu Thr Asp Asp Tyr Gly Lys Ala Gln Val Ala 155 160 165 aaa agc agc agc aag gaa tcc agg tca tcc aag ctc cac aag gag aag 582 Lys Ser Ser Ser Lys Glu Ser Arg Ser Ser Lys Leu His Lys Glu Lys 170 175 180 acc agg aaa gaa cgg gag ctg aag tct ggg cac aaa gac cgg agt aaa 630 Thr Arg Lys Glu Arg Glu Leu Lys Ser Gly His Lys Asp Arg Ser Lys 185 190 195 agt cat cga aaa agg gaa aca ccc aaa agt tac aaa aca gtg gac agc 678 Ser His Arg Lys Arg Glu Thr Pro Lys Ser Tyr Lys Thr Val Asp Ser 
200 205 210 215 cca aaa cgg aga tcc agg agc ccc cac agg aag tgg tct gac agc tcc 726 Pro Lys Arg Arg Ser Arg Ser Pro His Arg Lys Trp Ser Asp Ser Ser 220 225 230 aaa caa gat gat agc ccc tcg gga gct tct tat ggc caa gat tat gac 774 Lys Gln Asp Asp Ser Pro Ser Gly Ala Ser Tyr Gly Gln Asp Tyr Asp 235 240 245 ctt agt ccc tca cga tct cat acc tcg agc aat tat gac tcc tac aag 822 Leu Ser Pro Ser Arg Ser His Thr Ser Ser Asn Tyr Asp Ser Tyr Lys 250 255 260 aaa agt cct gga agt acc tcg aga agg cag tcg gtc agt ccc cct tac 870 Lys Ser Pro Gly Ser Thr Ser Arg Arg Gln Ser Val Ser Pro Pro Tyr 265 270 275 aag gag cct tcg gcc tac cag tcc agc acc cgg tca ccg agc ccc tac 918 Lys Glu Pro Ser Ala Tyr Gln Ser Ser Thr Arg Ser Pro Ser Pro Tyr 280 285 290 295 agt agg cga cag aga tct gtc agt ccc tat agc agg aga cgg tcg tcc 966 Ser Arg Arg Gln Arg Ser Val Ser Pro Tyr Ser Arg Arg Arg Ser Ser 300 305 310 agc tac gaa aga agt ggc tct tac agc ggg cga tcg ccc agt ccc tat 1014 Ser Tyr Glu Arg Ser Gly Ser Tyr Ser Gly Arg Ser Pro Ser Pro Tyr 315 320 325 ggt cga agg cgg tcc agc agc cct ttc ctg agc aag cgg tct ctg agt 1062 Gly Arg Arg Arg Ser Ser Ser Pro Phe Leu Ser Lys Arg Ser Leu Ser 330 335 340 cgg agt cca ctc ccc agt agg aaa tcc atg aag tcc aga agt aga agt 1110 Arg Ser Pro Leu Pro Ser Arg Lys Ser Met Lys Ser Arg Ser Arg Ser 345 350 355 cct gca tat tca aga cat tca tct tct cat agt aaa aag aag aga tcc 1158 Pro Ala Tyr Ser Arg His Ser Ser Ser His Ser Lys Lys Lys Arg Ser 360 365 370 375 agt tca cgc agt cgt cat tcc agt atc tca cct gtc agg ctt cca ctt 1206 Ser Ser Arg Ser Arg His Ser Ser Ile Ser Pro Val Arg Leu Pro Leu 380 385 390 aat tcc agt ctg gga gct gaa ctc agt agg aaa aag aag gaa aga gca 1254 Asn Ser Ser Leu Gly Ala Glu Leu Ser Arg Lys Lys Lys Glu Arg Ala 395 400 405 gct gct gct gct gca gca aag atg gat gga aag gag tcc aag ggt tca 1302 Ala Ala Ala Ala Ala Ala Lys Met Asp Gly Lys Glu Ser Lys Gly Ser 410 415 420 cct gta ttt ttg cct aga aaa gag aac agt tca gta gag gct aag gat 1350 Pro Val Phe Leu Pro Arg Lys 
Glu Asn Ser Ser Val Glu Ala Lys Asp 425 430 435 tca ggt ttg gag tct aaa aag tta ccc aga agt gta aaa ttg gaa aaa 1398 Ser Gly Leu Glu Ser Lys Lys Leu Pro Arg Ser Val Lys Leu Glu Lys 440 445 450 455 tct gcc cca gat act gaa ctg gtg aat gta aca cat cta aac aca gag 1446 Ser Ala Pro Asp Thr Glu Leu Val Asn Val Thr His Leu Asn Thr Glu 460 465 470 gta aaa aat tct tca gat aca ggg aaa gta aag ttg gat gag aac tcc 1494 Val Lys Asn Ser Ser Asp Thr Gly Lys Val Lys Leu Asp Glu Asn Ser 475 480 485 gag aag cat ctt gtt aaa gat ttg aaa gca cag gga aca aga gac tct 1542 Glu Lys His Leu Val Lys Asp Leu Lys Ala Gln Gly Thr Arg Asp Ser 490 495 500 aaa ccc ata gca ctg aaa gag gag att gtt act cca aag gag aca gaa 1590 Lys Pro Ile Ala Leu Lys Glu Glu Ile Val Thr Pro Lys Glu Thr Glu 505 510 515 aca tca gaa aag gag acc cct cca cct ctt ccc aca att gct tct ccc 1638 Thr Ser Glu Lys Glu Thr Pro Pro Pro Leu Pro Thr Ile Ala Ser Pro 520 525 530 535 cca ccc cct cta cca act act acc cct cca cct cag aca ccc cct ttg 1686 Pro Pro Pro Leu Pro Thr Thr Thr Pro Pro Pro Gln Thr Pro Pro Leu 540 545 550 cca cct ttg cct cca ata cca gct ctt cca cag caa cca cct ctg cct 1734 Pro Pro Leu Pro Pro Ile Pro Ala Leu Pro Gln Gln Pro Pro Leu Pro 555 560 565 cct tct cag cca gca ttt agt cag gtt cct gct tcc agt act tca act 1782 Pro Ser Gln Pro Ala Phe Ser Gln Val Pro Ala Ser Ser Thr Ser Thr 570 575 580 ttg ccc cct tct act cac tca aag aca tct gct gtg tcc tct cag gca 1830 Leu Pro Pro Ser Thr His Ser Lys Thr Ser Ala Val Ser Ser Gln Ala 585 590 595 aat tct cag ccc cct gta cag gtt tct gtg aag act caa gta tct gta 1878 Asn Ser Gln Pro Pro Val Gln Val Ser Val Lys Thr Gln Val Ser Val 600 605 610 615 aca gct gct att cca cac ctg aaa act tca acg ttg cct cct ttg ccc 1926 Thr Ala Ala Ile Pro His Leu Lys Thr Ser Thr Leu Pro Pro Leu Pro 620 625 630 ctc cca ccc tta tta cct gga ggt gat gac atg gat aga att tgt tgt 1974 Leu Pro Pro Leu Leu Pro Gly Gly Asp Asp Met Asp Arg Ile Cys Cys 635 640 645 cct cgt tat gga gaa aga aga caa aca gaa agc gac tgg 
ggg aaa cgc 2022 Pro Arg Tyr Gly Glu Arg Arg Gln Thr Glu Ser Asp Trp Gly Lys Arg 650 655 660 tgt gtg gac aag ttt gac att att ggg att att gga gaa gga acc tat 2070 Cys Val Asp Lys Phe Asp Ile Ile Gly Ile Ile Gly Glu Gly Thr Tyr 665 670 675 ggc caa gta tat aaa gcc agg gac aaa gac aca gga gaa cta gtg gct 2118 Gly Gln Val Tyr Lys Ala Arg Asp Lys Asp Thr Gly Glu Leu Val Ala 680 685 690 695 ctg aag aag gtg aga cta gac aat gag aaa gag ggc ttc cca atc aca 2166 Leu Lys Lys Val Arg Leu Asp Asn Glu Lys Glu Gly Phe Pro Ile Thr 700 705 710 gcc att cgt gaa atc aaa atc ctt cgt cag tta atc cac cga agt gtt 2214 Ala Ile Arg Glu Ile Lys Ile Leu Arg Gln Leu Ile His Arg Ser Val 715 720 725 gtt aac atg aag gaa att gtc aca gat aaa caa gat gca ctg gat ttc 2262 Val Asn Met Lys Glu Ile Val Thr Asp Lys Gln Asp Ala Leu Asp Phe 730 735 740 aag aag gac aaa ggt gcc ttt tac ctt gta ttt gag tat atg gac cat 2310 Lys Lys Asp Lys Gly Ala Phe Tyr Leu Val Phe Glu Tyr Met Asp His 745 750 755 gac tta atg gga ctg cta gaa tct ggt ttg gtg cac ttt tct gag gac 2358 Asp Leu Met Gly Leu Leu Glu Ser Gly Leu Val His Phe Ser Glu Asp 760 765 770 775 cat atc aag tcg ttc atg aaa cag cta atg gaa gga ttg gaa tac tgt 2406 His Ile Lys Ser Phe Met Lys Gln Leu Met Glu Gly Leu Glu Tyr Cys 780 785 790 cac aaa aag aat ttc ctg cat cgg gat att aag tgt tct aac att ttg 2454 His Lys Lys Asn Phe Leu His Arg Asp Ile Lys Cys Ser Asn Ile Leu 795 800 805 ctg aat aac agt ggg caa atc aaa cta gca gat ttt gga ctt gct cgg 2502 Leu Asn Asn Ser Gly Gln Ile Lys Leu Ala Asp Phe Gly Leu Ala Arg 810 815 820 ctc tat aac tct gaa gag agt cgc cct tac aca aac aaa gtc att act 2550 Leu Tyr Asn Ser Glu Glu Ser Arg Pro Tyr Thr Asn Lys Val Ile Thr 825 830 835 ttg tgg tac cga cct cca gaa cta ctg cta gga gag gaa cgt tac aca 2598 Leu Trp Tyr Arg Pro Pro Glu Leu Leu Leu Gly Glu Glu Arg Tyr Thr 840 845 850 855 cca gcc ata gat gtt tgg agc tgt gga tgt att ctt ggg gaa cta ttc 2646 Pro Ala Ile Asp Val Trp Ser Cys Gly Cys Ile Leu Gly Glu Leu Phe 860 865 870 aca aag 
aag cct att ttt caa gcc aat ctg gaa ctg gct cag cta gaa 2694 Thr Lys Lys Pro Ile Phe Gln Ala Asn Leu Glu Leu Ala Gln Leu Glu 875 880 885 ctg atc agc cga ctt tgt ggt agc cct tgt cca gct gtg tgg cct gat 2742 Leu Ile Ser Arg Leu Cys Gly Ser Pro Cys Pro Ala Val Trp Pro Asp 890 895 900 gtt atc aaa ctg ccc tac ttc aac acc atg aaa ccg aag aag caa tat 2790 Val Ile Lys Leu Pro Tyr Phe Asn Thr Met Lys Pro Lys Lys Gln Tyr 905 910 915 cga agg cgt cta cga gaa gaa ttc tct ttc att cct tct gca gca ctt 2838 Arg Arg Arg Leu Arg Glu Glu Phe Ser Phe Ile Pro Ser Ala Ala Leu 920 925 930 935 gat tta ttg gac cac atg ctg aca cta gat cct agt aag cgg tgc aca 2886 Asp Leu Leu Asp His Met Leu Thr Leu Asp Pro Ser Lys Arg Cys Thr 940 945 950 gct gaa cag acc cta cag agc gac ttc ctt aaa gat gtc gaa ctc agc 2934 Ala Glu Gln Thr Leu Gln Ser Asp Phe Leu Lys Asp Val Glu Leu Ser 955 960 965 aaa atg gct cct cca gac ctc ccc cac tgg cag gat tgc cat gag ttg 2982 Lys Met Ala Pro Pro Asp Leu Pro His Trp Gln Asp Cys His Glu Leu 970 975 980 tgg agt aag aaa cgg cga cgt cag cga caa agt ggt gtt gta gtc gaa 3030 Trp Ser Lys Lys Arg Arg Arg Gln Arg Gln Ser Gly Val Val Val Glu 985 990 995 gag cca cct cca tcc aaa act tct cga aaa gaa act acc tca ggg 3075 Glu Pro Pro Pro Ser Lys Thr Ser Arg Lys Glu Thr Thr Ser Gly 1000 1005 1010 aca agt act gag cct gtg aag aac agc agc cca gca cca cct cag 3120 Thr Ser Thr Glu Pro Val Lys Asn Ser Ser Pro Ala Pro Pro Gln 1015 1020 1025 cct gct cct ggc aag gtg gag tct ggg gct ggg gat gca ata ggc 3165 Pro Ala Pro Gly Lys Val Glu Ser Gly Ala Gly Asp Ala Ile Gly 1030 1035 1040 ctt gct gac atc aca caa cag ctg aat caa agt gaa ttg gca gtg 3210 Leu Ala Asp Ile Thr Gln Gln Leu Asn Gln Ser Glu Leu Ala Val 1045 1050 1055 tta tta aac ctg ctg cag agc caa acc gac ctg agc atc cct caa 3255 Leu Leu Asn Leu Leu Gln Ser Gln Thr Asp Leu Ser Ile Pro Gln 1060 1065 1070 atg gca cag ctg ctt aac atc cac tcc aac cca gag atg cag cag 3300 Met Ala Gln Leu Leu Asn Ile His Ser Asn Pro Glu Met Gln Gln 1075 1080 1085 cag 
ctg gaa gcc ctg aac caa tcc atc agt gcc ctg acg gaa gct 3345 Gln Leu Glu Ala Leu Asn Gln Ser Ile Ser Ala Leu Thr Glu Ala 1090 1095 1100 act tcc cag cag cag gac tca gag acc atg gcc cca gag gag tct 3390 Thr Ser Gln Gln Gln Asp Ser Glu Thr Met Ala Pro Glu Glu Ser 1105 1110 1115 ttg aag gaa gca ccc tct gcc cca gtg atc ctg cct tca gca gaa 3435 Leu Lys Glu Ala Pro Ser Ala Pro Val Ile Leu Pro Ser Ala Glu 1120 1125 1130 cag atg acc ctt gaa gct tca agc aca cca gct gac atg cag aat 3480 Gln Met Thr Leu Glu Ala Ser Ser Thr Pro Ala Asp Met Gln Asn 1135 1140 1145 ata ttg gca gtt ctc ttg agt cag ctg atg aaa acc caa gag cca 3525 Ile Leu Ala Val Leu Leu Ser Gln Leu Met Lys Thr Gln Glu Pro 1150 1155 1160 gca ggc agt ctg gag gaa aac aac agt gac aag aac agt ggg cca 3570 Ala Gly Ser Leu Glu Glu Asn Asn Ser Asp Lys Asn Ser Gly Pro 1165 1170 1175 cag ggg ccc cga aga act ccc aca atg cca cag gag gag gca gca 3615 Gln Gly Pro Arg Arg Thr Pro Thr Met Pro Gln Glu Glu Ala Ala 1180 1185 1190 gca tgt cct cct cac att ctt cca cca gag aag agg ccc cct gag 3660 Ala Cys Pro Pro His Ile Leu Pro Pro Glu Lys Arg Pro Pro Glu 1195 1200 1205 ccc ccc gga cct cca ccg ccg cca cct cca ccc cct ctg gtt gaa 3705 Pro Pro Gly Pro Pro Pro Pro Pro Pro Pro Pro Pro Leu Val Glu 1210 1215 1220 ggc gat ctt tcc agc gcc ccc cag gag ttg aac cca gcc gtg aca 3750 Gly Asp Leu Ser Ser Ala Pro Gln Glu Leu Asn Pro Ala Val Thr 1225 1230 1235 gcc gcc ttg ctg caa ctt tta tcc cag cct gaa gca gag cct cct 3795 Ala Ala Leu Leu Gln Leu Leu Ser Gln Pro Glu Ala Glu Pro Pro 1240 1245 1250 ggc cac ctg cca cat gag cac cag gcc ttg aga cca atg gag tac 3840 Gly His Leu Pro His Glu His Gln Ala Leu Arg Pro Met Glu Tyr 1255 1260 1265 tcc acc cga ccc cgt cca aac agg act tat gga aac act gat ggg 3885 Ser Thr Arg Pro Arg Pro Asn Arg Thr Tyr Gly Asn Thr Asp Gly 1270 1275 1280 cct gaa aca ggg ttc agt gcc att gac act gat gaa cga aac tct 3930 Pro Glu Thr Gly Phe Ser Ala Ile Asp Thr Asp Glu Arg Asn Ser 1285 1290 1295 ggt cca gcc ttg aca gaa tcc ttg gtc cag acc 
ctg gtg aag aac 3975 Gly Pro Ala Leu Thr Glu Ser Leu Val Gln Thr Leu Val Lys Asn 1300 1305 1310 agg acc ttc tca ggc tct ctg agc cac ctt ggg gag tcc agc agt 4020 Arg Thr Phe Ser Gly Ser Leu Ser His Leu Gly Glu Ser Ser Ser 1315 1320 1325 tac cag ggc aca ggg tca gtg cag ttt cca ggg gac cag gac ctc 4065 Tyr Gln Gly Thr Gly Ser Val Gln Phe Pro Gly Asp Gln Asp Leu 1330 1335 1340 cgt ttt gcc agg gtc ccc tta gcg tta cac ccg gtg gtc ggg caa 4110 Arg Phe Ala Arg Val Pro Leu Ala Leu His Pro Val Val Gly Gln 1345 1350 1355 cca ttc ctg aag gct gag gga agc agc aat tct gtg gta cat gca 4155 Pro Phe Leu Lys Ala Glu Gly Ser Ser Asn Ser Val Val His Ala 1360 1365 1370 gag acc aaa ttg caa aac tat ggg gag ctg ggg cca gga acc act 4200 Glu Thr Lys Leu Gln Asn Tyr Gly Glu Leu Gly Pro Gly Thr Thr 1375 1380 1385 ggg gcc agc agc tca gga gca ggc ctt cac tgg ggg ggc cca act 4245 Gly Ala Ser Ser Ser Gly Ala Gly Leu His Trp Gly Gly Pro Thr 1390 1395 1400 cag tct tct gct tat gga aaa ctc tat cgg ggg cct aca aga gtc 4290 Gln Ser Ser Ala Tyr Gly Lys Leu Tyr Arg Gly Pro Thr Arg Val 1405 1410 1415 cca cca aga ggg gga aga ggg aga gga gtt cct tac taacccagag 4336 Pro Pro Arg Gly Gly Arg Gly Arg Gly Val Pro Tyr 1420 1425 1430 acttcagtgt cctgaaagat tcctttccta tccatccttc catccagttc tctgaatctt 4396 taatgaaatc atttgccaga gcgaggtaat catctgcatt tggctactgc aaagctgtcc 4456 gttgtattcc ttgctcactt gctactagca ggcgacttag gaaataatga tgttggcacc 4516 agttccccct ggatgggcta tagccagaac atttacttca actctacctt agtagataca 4576 agtagagaat atggagagga tcattacatt gaaaagtaaa tgttttatta gttcattgcc 4636 tgcacttact ggtcggaaga gagaaagaac agtttcagta ttgagatggc tcaggagagg 4696 ctctttgatt tttaaagttt tggggtgggg ggttgtgtgt ggtttctttc ttttgaattt 4756 taatttaggt gttttgggtt tttttccttt aaagagaata gtgttcacaa aatttgagct 4816 gctctttggc ttttgctata agggaaacag agtggcctgg ctgatttgaa taaatgtttc 4876 tttcctctcc accatctcac attttgcttt taagtgaaca ctttttcccc attgagcatc 4936 ttgaacatac tttttttcca aataaattac tcatccttaa agtttactcc actttgacaa 4996 aagatacgcc 
cttctccctg cacataaagc aggttgtaga acgtggcatt cttgggcaag 5056 taggtagact ttacccagtc tctttccttt tttgctgatg tgtgctctct ctctctcttt 5116 ctctctctct ctctctctct ctctctctct gtctgtctcg cttgctcgct ctcgctgttt 5176 ctctctcttt gaggcatttg tttggaaaaa atcgttgaga tgcccaagaa cctgggataa 5236 ttctttactt tttttgaaat aaaggaaagg aaattc 5272 2 1431 PRT Homo sapiens 2 Met Pro Asn Ser Glu Arg His Gly Gly Lys Lys Asp Gly Ser Gly Gly 1 5 10 15 Ala Ser Gly Thr Leu Gln Pro Ser Ser Gly Gly Gly Ser Ser Asn Ser 20 25 30 Arg Glu Arg His Arg Leu Val Ser Lys His Lys Arg His Lys Ser Lys 35 40 45 His Ser Lys Asp Met Gly Leu Val Thr Pro Glu Ala Ala Ser Leu Gly 50 55 60 Thr Val Ile Lys Pro Leu Val Glu Tyr Asp Asp Ile Ser Ser Asp Ser 65 70 75 80 Asp Thr Phe Ser Asp Asp Met Ala Phe Lys Leu Asp Arg Arg Glu Asn 85 90 95 Asp Glu Arg Arg Gly Ser Asp Arg Ser Asp Arg Leu His Lys His Arg 100 105 110 His His Gln His Arg Arg Ser Arg Asp Leu Leu Lys Ala Lys Gln Thr 115 120 125 Glu Lys Glu Lys Ser Gln Glu Val Ser Ser Lys Ser Gly Ser Met Lys 130 135 140 Asp Arg Ile Ser Gly Ser Ser Lys Arg Ser Asn Glu Glu Thr Asp Asp 145 150 155 160 Tyr Gly Lys Ala Gln Val Ala Lys Ser Ser Ser Lys Glu Ser Arg Ser 165 170 175 Ser Lys Leu His Lys Glu Lys Thr Arg Lys Glu Arg Glu Leu Lys Ser 180 185 190 Gly His Lys Asp Arg Ser Lys Ser His Arg Lys Arg Glu Thr Pro Lys 195 200 205 Ser Tyr Lys Thr Val Asp Ser Pro Lys Arg Arg Ser Arg Ser Pro His 210 215 220 Arg Lys Trp Ser Asp Ser Ser Lys Gln Asp Asp Ser Pro Ser Gly Ala 225 230 235 240 Ser Tyr Gly Gln Asp Tyr Asp Leu Ser Pro Ser Arg Ser His Thr Ser 245 250 255 Ser Asn Tyr Asp Ser Tyr Lys Lys Ser Pro Gly Ser Thr Ser Arg Arg 260 265 270 Gln Ser Val Ser Pro Pro Tyr Lys Glu Pro Ser Ala Tyr Gln Ser Ser 275 280 285 Thr Arg Ser Pro Ser Pro Tyr Ser Arg Arg Gln Arg Ser Val Ser Pro 290 295 300 Tyr Ser Arg Arg Arg Ser Ser Ser Tyr Glu Arg Ser Gly Ser Tyr Ser 305 310 315 320 Gly Arg Ser Pro Ser Pro Tyr Gly Arg Arg Arg Ser Ser Ser Pro Phe 325 330 335 Leu Ser Lys Arg Ser Leu Ser Arg Ser Pro Leu Pro Ser Arg Lys 
Ser 340 345 350 Met Lys Ser Arg Ser Arg Ser Pro Ala Tyr Ser Arg His Ser Ser Ser 355 360 365 His Ser Lys Lys Lys Arg Ser Ser Ser Arg Ser Arg His Ser Ser Ile 370 375 380 Ser Pro Val Arg Leu Pro Leu Asn Ser Ser Leu Gly Ala Glu Leu Ser 385 390 395 400 Arg Lys Lys Lys Glu Arg Ala Ala Ala Ala Ala Ala Ala Lys Met Asp 405 410 415 Gly Lys Glu Ser Lys Gly Ser Pro Val Phe Leu Pro Arg Lys Glu Asn 420 425 430 Ser Ser Val Glu Ala Lys Asp Ser Gly Leu Glu Ser Lys Lys Leu Pro 435 440 445 Arg Ser Val Lys Leu Glu Lys Ser Ala Pro Asp Thr Glu Leu Val Asn 450 455 460 Val Thr His Leu Asn Thr Glu Val Lys Asn Ser Ser Asp Thr Gly Lys 465 470 475 480 Val Lys Leu Asp Glu Asn Ser Glu Lys His Leu Val Lys Asp Leu Lys 485 490 495 Ala Gln Gly Thr Arg Asp Ser Lys Pro Ile Ala Leu Lys Glu Glu Ile 500 505 510 Val Thr Pro Lys Glu Thr Glu Thr Ser Glu Lys Glu Thr Pro Pro Pro 515 520 525 Leu Pro Thr Ile Ala Ser Pro Pro Pro Pro Leu Pro Thr Thr Thr Pro 530 535 540 Pro Pro Gln Thr Pro Pro Leu Pro Pro Leu Pro Pro Ile Pro Ala Leu 545 550 555 560 Pro Gln Gln Pro Pro Leu Pro Pro Ser Gln Pro Ala Phe Ser Gln Val 565 570 575 Pro Ala Ser Ser Thr Ser Thr Leu Pro Pro Ser Thr His Ser Lys Thr 580 585 590 Ser Ala Val Ser Ser Gln Ala Asn Ser Gln Pro Pro Val Gln Val Ser 595 600 605 Val Lys Thr Gln Val Ser Val Thr Ala Ala Ile Pro His Leu Lys Thr 610 615 620 Ser Thr Leu Pro Pro Leu Pro Leu Pro Pro Leu Leu Pro Gly Gly Asp 625 630 635 640 Asp Met Asp Arg Ile Cys Cys Pro Arg Tyr Gly Glu Arg Arg Gln Thr 645 650 655 Glu Ser Asp Trp Gly Lys Arg Cys Val Asp Lys Phe Asp Ile Ile Gly 660 665 670 Ile Ile Gly Glu Gly Thr Tyr Gly Gln Val Tyr Lys Ala Arg Asp Lys 675 680 685 Asp Thr Gly Glu Leu Val Ala Leu Lys Lys Val Arg Leu Asp Asn Glu 690 695 700 Lys Glu Gly Phe Pro Ile Thr Ala Ile Arg Glu Ile Lys Ile Leu Arg 705 710 715 720 Gln Leu Ile His Arg Ser Val Val Asn Met Lys Glu Ile Val Thr Asp 725 730 735 Lys Gln Asp Ala Leu Asp Phe Lys Lys Asp Lys Gly Ala Phe Tyr Leu 740 745 750 Val Phe Glu Tyr Met Asp His Asp Leu Met Gly Leu Leu Glu Ser Gly 
755 760 765 Leu Val His Phe Ser Glu Asp His Ile Lys Ser Phe Met Lys Gln Leu 770 775 780 Met Glu Gly Leu Glu Tyr Cys His Lys Lys Asn Phe Leu His Arg Asp 785 790 795 800 Ile Lys Cys Ser Asn Ile Leu Leu Asn Asn Ser Gly Gln Ile Lys Leu 805 810 815 Ala Asp Phe Gly Leu Ala Arg Leu Tyr Asn Ser Glu Glu Ser Arg Pro 820 825 830 Tyr Thr Asn Lys Val Ile Thr Leu Trp Tyr Arg Pro Pro Glu Leu Leu 835 840 845 Leu Gly Glu Glu Arg Tyr Thr Pro Ala Ile Asp Val Trp Ser Cys Gly 850 855 860 Cys Ile Leu Gly Glu Leu Phe Thr Lys Lys Pro Ile Phe Gln Ala Asn 865 870 875 880 Leu Glu Leu Ala Gln Leu Glu Leu Ile Ser Arg Leu Cys Gly Ser Pro 885 890 895 Cys Pro Ala Val Trp Pro Asp Val Ile Lys Leu Pro Tyr Phe Asn Thr 900 905 910 Met Lys Pro Lys Lys Gln Tyr Arg Arg Arg Leu Arg Glu Glu Phe Ser 915 920 925 Phe Ile Pro Ser Ala Ala Leu Asp Leu Leu Asp His Met Leu Thr Leu 930 935 940 Asp Pro Ser Lys Arg Cys Thr Ala Glu Gln Thr Leu Gln Ser Asp Phe 945 950 955 960 Leu Lys Asp Val Glu Leu Ser Lys Met Ala Pro Pro Asp Leu Pro His 965 970 975 Trp Gln Asp Cys His Glu Leu Trp Ser Lys Lys Arg Arg Arg Gln Arg 980 985 990 Gln Ser Gly Val Val Val Glu Glu Pro Pro Pro Ser Lys Thr Ser Arg 995 1000 1005 Lys Glu Thr Thr Ser Gly Thr Ser Thr Glu Pro Val Lys Asn Ser 1010 1015 1020 Ser Pro Ala Pro Pro Gln Pro Ala Pro Gly Lys Val Glu Ser Gly 1025 1030 1035 Ala Gly Asp Ala Ile Gly Leu Ala Asp Ile Thr Gln Gln Leu Asn 1040 1045 1050 Gln Ser Glu Leu Ala Val Leu Leu Asn Leu Leu Gln Ser Gln Thr 1055 1060 1065 Asp Leu Ser Ile Pro Gln Met Ala Gln Leu Leu Asn Ile His Ser 1070 1075 1080 Asn Pro Glu Met Gln Gln Gln Leu Glu Ala Leu Asn Gln Ser Ile 1085 1090 1095 Ser Ala Leu Thr Glu Ala Thr Ser Gln Gln Gln Asp Ser Glu Thr 1100 1105 1110 Met Ala Pro Glu Glu Ser Leu Lys Glu Ala Pro Ser Ala Pro Val 1115 1120 1125 Ile Leu Pro Ser Ala Glu Gln Met Thr Leu Glu Ala Ser Ser Thr 1130 1135 1140 Pro Ala Asp Met Gln Asn Ile Leu Ala Val Leu Leu Ser Gln Leu 1145 1150 1155 Met Lys Thr Gln Glu Pro Ala Gly Ser Leu Glu Glu Asn Asn Ser 1160 1165 1170 Asp Lys 
Asn Ser Gly Pro Gln Gly Pro Arg Arg Thr Pro Thr Met 1175 1180 1185 Pro Gln Glu Glu Ala Ala Ala Cys Pro Pro His Ile Leu Pro Pro 1190 1195 1200 Glu Lys Arg Pro Pro Glu Pro Pro Gly Pro Pro Pro Pro Pro Pro 1205 1210 1215 Pro Pro Pro Leu Val Glu Gly Asp Leu Ser Ser Ala Pro Gln Glu 1220 1225 1230 Leu Asn Pro Ala Val Thr Ala Ala Leu Leu Gln Leu Leu Ser Gln 1235 1240 1245 Pro Glu Ala Glu Pro Pro Gly His Leu Pro His Glu His Gln Ala 1250 1255 1260 Leu Arg Pro Met Glu Tyr Ser Thr Arg Pro Arg Pro Asn Arg Thr 1265 1270 1275 Tyr Gly Asn Thr Asp Gly Pro Glu Thr Gly Phe Ser Ala Ile Asp 1280 1285 1290 Thr Asp Glu Arg Asn Ser Gly Pro Ala Leu Thr Glu Ser Leu Val 1295 1300 1305 Gln Thr Leu Val Lys Asn Arg Thr Phe Ser Gly Ser Leu Ser His 1310 1315 1320 Leu Gly Glu Ser Ser Ser Tyr Gln Gly Thr Gly Ser Val Gln Phe 1325 1330 1335 Pro Gly Asp Gln Asp Leu Arg Phe Ala Arg Val Pro Leu Ala Leu 1340 1345 1350 His Pro Val Val Gly Gln Pro Phe Leu Lys Ala Glu Gly Ser Ser 1355 1360 1365 Asn Ser Val Val His Ala Glu Thr Lys Leu Gln Asn Tyr Gly Glu 1370 1375 1380 Leu Gly Pro Gly Thr Thr Gly Ala Ser Ser Ser Gly Ala Gly Leu 1385 1390 1395 His Trp Gly Gly Pro Thr Gln Ser Ser Ala Tyr Gly Lys Leu Tyr 1400 1405 1410 Arg Gly Pro Thr Arg Val Pro Pro Arg Gly Gly Arg Gly Arg Gly 1415 1420 1425 Val Pro Tyr 1430 

What is claimed is:
 1. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 2, and fragments thereof.
 2. An isolated nucleic acid encoding the polypeptide of claim 1, and fragments thereof.
 3. The isolated nucleic acid of claim 2, which comprises the nucleotide sequence of SEQ ID NO: 1.
 4. The isolated nucleic acid of claim 3, wherein the fragments comprise nucleotides 1963 to 1968 of SEQ ID NO: 1.
 5. An expression vector comprising the nucleic acid of any one of claims 2 to 5.
 6. A host cell transformed with the expression vector of claim 6.
 7. A method for producing the polypeptide of claim 1, which comprises the steps of: (1) culturing the host cell of claim 6 under a condition suitable for the expression of the polypeptide; and (2) recovering the polypeptide from the host cell culture.
 8. An antibody specifically binding to the polypeptide of claim 1.
 9. A method for diagnosing the diseases associated with the deficiency of human CrkRS gene in a mammal which comprises detecting the nucleic acid of any one of claims 2 to 4 or the polypeptide of claim 1.
 10. The method of claim 9, wherein the diseases are lung cancers.
 11. The method of claim 9, wherein the detection of the nucleic acid of any one of claims 2 to 4 comprises the steps of: (1) extracting total RNA from a sample obtained from the mammal; (2) amplifying the RNA by reverse transcriptase-polymerase chain reaction (RT-PCR) with a pair of primers to obtain a cDNA sample comprising the nucleotides 1963 to 1968 of SEQ ID NO: 1; and (3) detecting whether the cDNA sample is obtained.
 12. The method of claim 11, wherein one of the primers has a sequence comprising the nucleotides of SEQ ID NO: 1 containing nucleotides 1963 to 1968, and the other has a sequence complementary to the nucleotides of SEQ ID NO: 1 at any other locations downstream of nucleotide 1968, or one of the primers has a sequence complementary to the nucleotides of SEQ ID NO: 1 containing nucleotides 1963 to 1968, and the other has a sequence comprising the nucleotides of SEQ ID NO: 1 at any other locations upstream of nucleotide 1963.
 13. The method of claim 11, wherein one of the primers has a sequence comprising the nucleotides of SEQ ID NO: 1 upstream of nucleotide 1964 and the other has a sequence complementary to the nucleotides of SEQ ID NO: 1 downstream of nucleotide 1965, or one of the primers has a sequence complementary to the nucleotides of SEQ ID NO: 1 upstream of nucleotide 1964 and the other has a sequence comprising the nucleotides of SEQ ID NO: 1 downstream of nucleotide 1965.
 14. The method of claim 13, wherein the cDNA sample amplified from SEQ ID NO: 1 is 177 bp shorter than the cDNA sample amplified from CrkRS.
 15. The method of claim 11 further comprising the step of detecting the amount of the amplified cDNA sample.
 16. The method of claim 9, wherein the detection of the nucleic acid of any one of claims 2 to 4 comprises the steps of: (1) extracting the total RNA of a sample obtained from the mammal; (2) amplifying the RNA by reverse transcriptase-polymerase chain reaction (RT-PCR) to obtain a cDNA sample; (3) bringing the cDNA sample into contact with the nucleic acid of any one of claims 2 to 4; and (4) detecting whether the cDNA sample hybridizes with the nucleic acid of any one of claims 2 to 4.
 17. The method of claim 16 further comprising the step of detecting the amount of hybridized sample.
 18. The method of claim 9, wherein the detection of the polypeptide of claim 1 comprises the steps of contacting the antibody of claim 8 with a protein sample obtained from the mammal, and detecting whether an antibody-polypeptide complex is formed.
 19. The method of claim 18 further comprising the step of detecting the amount of the antibody-polypeptide complex. 